Accepted for publication in Microprocessors and Microsystems.

Introduction

Recent advances in VLSI technology have made it possible to use caches large enough to eliminate most of the "conventional" cache misses resulting from limited cache size and/or set associativity. As a result, other "non-conventional" cache misses that used to be obscured by the more frequent conventional misses are becoming increasingly dominant in the makeup of total cache misses. In order to reduce these non-conventional cache misses, there have been attempts to characterize realistic workloads and identify their sources. An example of such attempts is the so-called "cache affinity scheduling" [1-5], which aims at reducing cache misses due to context switching. Another example of cache optimization for non-conventional misses arises in dynamic heap allocation. Functional programming languages such as LISP make extensive use of the dynamic heap. Peng and Sohi reported a significant performance improvement by characterizing the behavior of heap references and optimizing their cache performance [6].

This paper focuses on another source of non-conventional cache misses: those that occur during page clearing in a virtual memory system. In a virtual memory system, when a physical page is remapped to a new virtual page, the previous contents of the physical page must be cleared by overwriting the whole page with zeros for security reasons. This results in back-to-back write accesses to all the blocks in the page. These back-to-back write accesses not only cause cache misses for blocks not in the cache, but also pollute the cache with blocks that are not immediately needed. Results in the published literature [7-9] show that the performance degradation resulting from block memory operations, of which the page clearing operation is an instance, is quite significant. In this paper, we propose a lazy (on-demand) page clearing scheme that delays actual clearing until the blocks of the cleared page are actually accessed.
When a block is actually accessed, it is cleared in-cache using a hardware zero register, thus eliminating costly main memory accesses.

The rest of this paper is organized as follows. The next section presents a brief review of related work. We then describe in detail the proposed on-demand, in-cache clearing scheme and qualitatively analyze the cache overheads related to page clearing. In the following section, we describe the simulator and the traces used to assess the performance improvement offered by the proposed scheme. The following section presents the results from …
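The lazy clearing idea can be illustrated with a small software model. The sketch below is not the authors' hardware design: the class name, the 4 KB page size, and the 32-byte block size are all illustrative assumptions. A remapped page is merely flagged as pending-clear, and each cache-block-sized chunk is zeroed on its first access, standing in for the in-cache clear that the hardware zero register performs.

```python
PAGE_SIZE = 4096   # bytes per page (assumed for illustration)
BLOCK_SIZE = 32    # bytes per cache block (assumed for illustration)
BLOCKS_PER_PAGE = PAGE_SIZE // BLOCK_SIZE

class LazyClearedPage:
    """Software model of on-demand page clearing (illustrative only)."""

    def __init__(self):
        self.data = bytearray(b"\xff" * PAGE_SIZE)     # stale prior contents
        self.needs_clear = [False] * BLOCKS_PER_PAGE   # per-block pending flag

    def remap(self):
        # Eager clearing would write zeros to every block here, producing
        # back-to-back memory writes. Instead, only mark each block as
        # pending-clear; no memory traffic occurs yet.
        self.needs_clear = [True] * BLOCKS_PER_PAGE

    def read(self, offset):
        blk = offset // BLOCK_SIZE
        if self.needs_clear[blk]:
            # "In-cache" clear: zero only this block, on first touch,
            # mimicking the hardware zero register filling the cache line.
            start = blk * BLOCK_SIZE
            self.data[start:start + BLOCK_SIZE] = bytes(BLOCK_SIZE)
            self.needs_clear[blk] = False
        return self.data[offset]

page = LazyClearedPage()
page.remap()
print(page.read(0))            # block 0 is zeroed on first access
print(page.needs_clear[1])     # untouched blocks remain pending
```

In this model, the cost of clearing is paid one block at a time and only for blocks the program actually touches, which is the behavior the proposed scheme realizes in hardware without main memory writes.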

Publication date: 1998